Generalized Statistical Methods for Mixed Exponential Families, Part I: Theoretical Foundations
Abstract
This work considers the problem of learning the underlying statistical structure of multidimensional data of mixed probability distribution types (continuous and discrete) for the purpose of fitting a generative model and making decisions in a data-driven manner. Using properties of exponential family distributions and generalizing classical linear statistics techniques, a unified theoretical model called Generalized Linear Statistics (GLS) is established. The methodology exploits the split between data space and natural parameter space for exponential family distributions and solves a nonlinear problem by using classical linear statistical tools applied to data that have been mapped into the parameter space. The framework is equivalent to a computationally tractable, mixed data-type hierarchical Bayes graphical model assumption with latent variables constrained to a low-dimensional parameter subspace. We demonstrate that exponential family Principal Component Analysis, Semi-Parametric exponential family Principal Component Analysis, and Bregman soft clustering are not separate unrelated algorithms, but different manifestations of model assumptions and parameter choices taken within this common GLS framework. We readily extend these algorithms to deal with the important mixed data-type case. We study in detail the extreme case corresponding to exponential family Principal Component Analysis and solve problems related to fitting the generative model.
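The split between data space and natural parameter space can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes only two attribute types (unit-variance Gaussian and Bernoulli), a natural parameter matrix constrained to the low-rank form `A @ V`, and plain gradient descent on the negative log-likelihood. The names `fit_epca`, `log_partition`, and `mean_map`, as well as the step size and iteration count, are hypothetical choices made for illustration.

```python
import numpy as np

def log_partition(theta, kinds):
    """Column-wise log-partition function G(theta) for each attribute type."""
    out = np.empty_like(theta)
    for j, kind in enumerate(kinds):
        t = theta[:, j]
        if kind == "gaussian":        # unit variance: G(theta) = theta^2 / 2
            out[:, j] = 0.5 * t ** 2
        elif kind == "bernoulli":     # G(theta) = log(1 + exp(theta))
            out[:, j] = np.logaddexp(0.0, t)
        else:
            raise ValueError(f"unknown attribute type: {kind}")
    return out

def mean_map(theta, kinds):
    """Gradient of G: maps natural parameters to expected data values."""
    out = np.empty_like(theta)
    for j, kind in enumerate(kinds):
        t = theta[:, j]
        if kind == "gaussian":
            out[:, j] = t
        elif kind == "bernoulli":     # numerically stable logistic function
            out[:, j] = 0.5 * (1.0 + np.tanh(0.5 * t))
        else:
            raise ValueError(f"unknown attribute type: {kind}")
    return out

def neg_log_lik(X, theta, kinds):
    """Negative log-likelihood, up to terms that do not depend on theta."""
    return float(np.sum(log_partition(theta, kinds) - X * theta))

def fit_epca(X, kinds, rank=2, steps=400, lr=0.005, seed=0):
    """Fit theta = A @ V by gradient descent (illustrative assumption only)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A = 0.01 * rng.standard_normal((n, rank))   # latent coordinates per sample
    V = 0.01 * rng.standard_normal((rank, d))   # basis of the parameter subspace
    history = []
    for _ in range(steps):
        theta = A @ V
        history.append(neg_log_lik(X, theta, kinds))
        R = mean_map(theta, kinds) - X           # gradient w.r.t. theta
        grad_A, grad_V = R @ V.T, A.T @ R        # chain rule through A @ V
        A -= lr * grad_A
        V -= lr * grad_V
    return A, V, history

# Synthetic mixed-type data drawn from rank-2 natural parameters.
rng = np.random.default_rng(42)
theta_true = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 4))
X = theta_true + rng.standard_normal((200, 4))            # Gaussian columns
p = 0.5 * (1.0 + np.tanh(0.5 * theta_true[:, 2:]))
X[:, 2:] = (rng.random((200, 2)) < p).astype(float)       # Bernoulli columns
kinds = ["gaussian", "gaussian", "bernoulli", "bernoulli"]

A, V, history = fit_epca(X, kinds)
print(f"objective: start {history[0]:.1f}, end {history[-1]:.1f}")
```

The linear structure lives entirely in the parameter space (`A @ V`), while the nonlinearity enters only through the type-specific mean map that links parameters back to the data. Supporting another exponential family member in this sketch would mean adding its log-partition function and mean map.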
Similar resources
Estimation in Simple Step-Stress Model for the Marshall-Olkin Generalized Exponential Distribution under Type-I Censoring
This paper considers the simple step-stress model for the Marshall-Olkin generalized exponential distribution when there is a time constraint on the duration of the experiment. The maximum likelihood equations for estimating the parameters are derived, assuming a cumulative exposure model with lifetimes distributed as Marshall-Olkin generalized exponential. The likelihood equations do not lea...
Generalized Statistical Methods for Mixed Exponential Families, Part II: Applications
This work considers the problem of both supervised and unsupervised classification for vector data of mixed types. An important subclass of graphical modeling techniques called Generalized Linear Statistics (GLS) is used to capture the underlying statistical structure of these complex data. The GLS methodology exploits the split between data space and natural parameter space for exponential fam...
Graphical Models, Exponential Families, and Variational Inference
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimizati...
An EM Algorithm for Estimating the Parameters of the Generalized Exponential Distribution under Unified Hybrid Censored Data
Unified hybrid censoring is a mixture of generalized Type-I and Type-II hybrid censoring schemes. This article presents statistical inference on the parameters of the Generalized Exponential Distribution when the data are obtained from the unified hybrid censoring scheme. It is observed that the maximum likelihood estimators cannot be derived in closed form. The EM algorithm for computing the ma...
Statistical Modeling for Oblique Collision of Nano and Micro Droplets in Plasma Spray Processes
Spreading and coating of nano and micro droplets on solid surfaces is important in a wide variety of applications, including plasma spray coating, ink jet printing, and DNA synthesis. In spraying processes, most droplets strike the surface obliquely. The purpose of this article is to study the distribution of nano and micro droplets spreading when droplets impact at an oblique a...